It was announced that the Kasabi data publishing platform, created by Talis, would close on July 30, 2012. While the service has only been around for about two years, it represents a unique look at services for Linked Data and contains a variety of datasets. In a subsequent post, Kasabi announced the availability of a spreadsheet that lists where the datasets can be downloaded from Amazon S3. This spreadsheet has been uploaded to the Internet Archive as datasets.csv, and each referenced dataset has been uploaded as well.
The database snapshots are gzipped N-Triples and gzipped N-Quads. The N-Quads files are different because they include a named graph URL for the dataset. For datasets that have a named graph URL, the graph was downloaded and saved using the dataset name with the ".nt" extension. For example, prelinger-archives.gz has its named graph stored as prelinger-archives.nt.
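To illustrate the difference between the two formats, here is a minimal Python sketch that reads a gzipped snapshot and reports which named graph URLs it mentions. It is not the code used to build this collection: the function names `graph_of` and `graphs_in` are made up for this example, and the regular expression is a simplified approximation of the N-Quads grammar rather than a full parser. A statement with three terms is treated as a plain triple, and a statement with a fourth IRI term is treated as a quad whose last term is the graph.

```python
import gzip
import re

# One statement term: an IRI, a blank node, or a literal with an optional
# datatype or language tag. This is a simplification of the N-Quads grammar.
TERM = re.compile(
    r'<[^>]*>'
    r'|_:[^\s]+'
    r'|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?'
)


def graph_of(line):
    """Return the graph IRI of one statement, or None for a plain triple."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line or comment
    terms = TERM.findall(line)
    if len(terms) == 4 and terms[3].startswith("<"):
        return terms[3][1:-1]  # drop the angle brackets
    return None


def graphs_in(path):
    """Collect the distinct graph IRIs found in a gzipped snapshot file."""
    found = set()
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            graph = graph_of(line)
            if graph is not None:
                found.add(graph)
    return found
```

Calling `graphs_in("prelinger-archives.gz")` on a downloaded snapshot would return the set of graph URLs it references, or an empty set if the file holds only triples. For anything beyond a quick inspection, a dedicated RDF library is a safer choice than this regular expression.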
The code used to download the datasets from S3 and upload them to the Internet Archive is available on GitHub.